HBASE-29947 Improve CopyTable usage instructions for copying between secure and non-secure clusters #7829

Merged
NihalJain merged 3 commits into apache:master from srinireddy2020:HBASE-29947 on Mar 26, 2026

@srinireddy2020
Contributor

Improve CopyTable command usage to execute in secure clusters

Contributor

@vaijosh vaijosh left a comment

LGTM

Copilot AI left a comment

Pull request overview

Improves CopyTable’s CLI help/usage output by adding additional examples aimed at running CopyTable across secure/insecure clusters (including different Kerberos realms).

Changes:

  • Add new CopyTable usage examples showing how to set authentication/principal-related settings when copying across clusters with different security modes.
  • Include example commands for secure→insecure, cross-realm secure→secure, and insecure→secure scenarios.
Comments suppressed due to low confidence (4)

hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java:260

  • These new examples rely on --peer.adr and -Dhbase.mapred.output.* overrides. Both are marked deprecated in the codebase (see CopyTable usage text for peer.adr, and TableOutputFormat.OUTPUT_CONF_PREFIX which is deprecated since 3.0.0 and slated for removal in 4.0.0). To keep the usage future-proof, please update the examples to use --peer.uri and pass any per-cluster overrides via the connection URI query parameters instead of hbase.mapred.output.*.
    System.err.println(
      " To copy the data of 'TestTable' between the secured cluster and insecure cluster-b");
    System.err.println(" $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable "
      + "-Dhbase.mapred.output.hbase.security.authentication=simple "
      + "--peer.adr=cluster-b-1.example.com,cluster-b-2.example.com,cluster-b-3.example.com:"
      + "2181:/cluster-b" + " TestTable");
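
A rough sketch of how the secure-to-insecure example might look if it followed this suggestion. It is not the merged text. The `hbase+zk://host:port,.../znode` URI shape, and the idea that a `?key=value` query on that URI is applied as a per-cluster override, are assumptions taken from the comment above and should be checked against the target HBase version. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; it is not part of CopyTable.
class PeerUriUsageSketch {

  // Builds a ZooKeeper-based connection URI string.
  // Assumed shape: hbase+zk://host1:port,host2:port/znodeParent?key=value
  public static String peerUri(List<String> zkHosts, int zkPort, String znodeParent,
      String overrides) {
    StringJoiner quorum = new StringJoiner(",");
    for (String host : zkHosts) {
      quorum.add(host + ":" + zkPort);
    }
    String uri = "hbase+zk://" + quorum + znodeParent;
    // Append the per-cluster overrides as a query string only when some are given.
    return overrides.isEmpty() ? uri : uri + "?" + overrides;
  }

  // Usage line for copying from the local secure cluster to insecure cluster-b.
  public static String secureToInsecureExample() {
    String uri = peerUri(
      List.of("cluster-b-1.example.com", "cluster-b-2.example.com", "cluster-b-3.example.com"),
      2181, "/cluster-b", "hbase.security.authentication=simple");
    // The URI is single-quoted so that a shell does not treat '?' as a glob character.
    return " $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable --peer.uri='" + uri
      + "' TestTable";
  }

  public static void main(String[] args) {
    System.err.println(
      " To copy 'TestTable' from the local secure cluster to insecure cluster-b");
    System.err.println(secureToInsecureExample());
  }
}
```

Building the URI in one place keeps the quorum, znode parent, and override string visible as separate inputs, which may make it easier to see what replaces the deprecated `--peer.adr` and `-Dhbase.mapred.output.*` pair.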

hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java:257

  • The example description is ambiguous about direction (“between the secured cluster and insecure cluster-b”), but CopyTable reads from the local cluster and writes to the configured peer. Consider rewording to explicitly state source vs destination (and use consistent terminology like “secure/secured” and “insecure”).
    System.err.println(
      " To copy the data of 'TestTable' between the secured cluster and insecure cluster-b");
    System.err.println(" $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable "

hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java:265

  • Wording/grammar in this example header is a bit hard to parse (“between different realm secured cluster”) and “kerberos” should be capitalized as “Kerberos” in user-facing help text. Please rephrase for clarity (e.g., “between secured clusters in different Kerberos realms”).
    System.err.println(" To copy the data of 'TestTable' between different realm secured cluster.");
    System.err.println(" Assume cluster-b uses different kerberos principal, "
      + "cluster-b/_HOST@EXAMPLE.COM, for master and regionserver kerberos principal from another "
      + "cluster");

hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java:275

  • Similarly here, “between the insecure cluster and secured cluster-b” doesn’t clarify which side is the local cluster vs the configured peer. Since the command always writes to the peer cluster, consider rewording to make the direction explicit.
    System.err.println(
      " To copy the data of 'TestTable' between the insecure cluster and secured cluster-b");
    System.err.println(" $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable "
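
The three wording comments above all ask for headers that name the source and the destination. A minimal sketch of one way to phrase them, assuming the local cluster is always the source and the peer is always the destination, as the comments state. The helper and the exact wording are hypothetical suggestions, not the text that was merged.

```java
// Hypothetical helper for illustration only; it is not part of CopyTable.
class DirectionWordingSketch {

  // Formats a usage header that states the copy direction explicitly.
  public static String header(String table, String source, String destination) {
    return " To copy the data of '" + table + "' from " + source + " to " + destination;
  }

  public static void main(String[] args) {
    // Secure local cluster (source) to insecure peer (destination).
    System.err.println(header("TestTable", "the local secure cluster", "insecure cluster-b"));
    // Secure clusters in different Kerberos realms.
    System.err.println(header("TestTable", "the local secure cluster",
      "secure cluster-b in a different Kerberos realm"));
    // Insecure local cluster (source) to secure peer (destination).
    System.err.println(header("TestTable", "the local insecure cluster", "secure cluster-b"));
  }
}
```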

@NihalJain NihalJain merged commit a4280d8 into apache:master Mar 26, 2026
@NihalJain
Contributor

I took the liberty of updating the Jira/commit title to be clearer: "Improve CopyTable usage instructions for copying between secure and non-secure clusters". FYI @srinireddy2020, and thank you for the fix. I will try to backport to other branches and will let you know if the changes don't apply cleanly.

@NihalJain NihalJain changed the title from "HBASE-29947 Improve CopyTable command usage to execute in secure clusters" to "HBASE-29947 Improve CopyTable usage instructions for copying between secure and non-secure clusters" on Mar 26, 2026
NihalJain pushed a commit that referenced this pull request Mar 26, 2026
…secure and non-secure clusters (#7829)

Signed-off-by: Nihal Jain <nihaljain@apache.org>
Signed-off-by: Pankaj Kumar <pankajkumar@apache.org>
Reviewed-by: Vaibhav Joshi <vjoshi@cloudera.com>

(cherry picked from commit a4280d8)
5 participants